Abstract

There has been growing interest in recent years in Q-matrix based 
cognitive diagnosis models. Parameter estimation and respondent classi- 
fication under these models may suffer due to identifiability issues. Non- 
identifiability can be described by a partition separating attribute profiles 
into groups of those with identical likelihoods. Marginal identifiability 
concerns the identifiability of individual attributes. Maximum likelihood 
estimation of the proportion of respondents within each equivalence class 
is consistent, making possible a new measure of assessment quality report- 
ing the proportion of respondents for whom each individual attribute is 
marginally identifiable. Arising from this is a new posterior-based classi- 
fication method adjusting for non-identifiability. 

Keywords: CDM, diagnostic classification, DINA, DINO, NIAD-DINA, 
Q-matrix, consistency, identifiability 

1 Introduction 

Diagnostic assessments are created with the goal of making classification-based 
decisions about respondents' possession of multiple latent traits, also known as 
attributes. Researchers have brought a number of tools to bear on the problem 
of diagnostic classification, including multidimensional IRT, factor analysis, the 
rule-space method, the attribute hierarchy method, clustering methods, and 
cognitive diagnosis models (CDMs); for a recent review, see Rupp, Templin, and
Henson (2010). CDMs are multidimensional latent variable models with a vector
of binary latent variables representing mastery of a finite set of skills whose
analysis results in a probabilistic attribute profile; this makes them well-suited 
to diagnostic classification. Well known models include the Deterministic Input, 
Noisy "And" Gate (DINA) model, the Deterministic Input, Noisy "Or" Gate 
(DINO) model, the Noisy Inputs, Deterministic "And" Gate (NIDA) model, 
the Noisy Inputs, Deterministic "Or" Gate (NIDO) model, and the Conjunctive
Reparameterized Unified Model (C-RUM), among others (Rupp et al., 2010;
Haertel, 1989; Junker & Sijtsma, 2001; de la Torre & Douglas, 2004; Maris,
1999; Templin & Henson, 2006; C. Tatsuoka, 2002; Templin, 2006; de la Torre,
...).

The DINA model is one of the best known and most widely used CDMs. Underlying
the model is the assumption that, before slipping and guessing come into
play, a respondent must have mastered all necessary (as specified by a loading 
matrix known as the Q-matrix) attributes required by a particular item in or- 
der to answer that item correctly. Thus it is a conjunctive, non-compensatory 
model, well-suited to educational assessments in areas such as mathematics 
where correct answers are obtained by correctly employing all of an item's re- 
quired skills together. The DINA model has been frequently employed in the 
analysis of assessments, including the widely analyzed fraction subtraction data
set of K. K. Tatsuoka (1990) (see de la Torre, 2009; de la Torre & Douglas, 2004,
2008; Templin, Henson, & Douglas, 2006; DeCarlo, 2011; Henson, Templin, &
Willse, 2009). However, even after many refinements to the methodology, there
are still some persistent issues. In the fraction subtraction data, for example,
respondents who answer all items incorrectly are often classified as having most
of the skills (DeCarlo, 2011). Classification issues of this type can result from
model misspecification, but they can also be the unavoidable consequence of the
structure of the assessment. Specifically, in the DINA model, attributes that
appear solely in conjunction with other attributes are problematic (DeCarlo,
2011). This is due to an issue with attribute identifiability, which has long
been known (K. K. Tatsuoka, 1991; DiBello, Stout, & Roussos, 1995; DeCarlo,
2011), but tends to be ignored in practice.

This paper gives a formal treatment of the identifiability issue of the DINA 
model and related CDMs. Since an assessment with fully identifiable attributes 
is often unavailable, we include guidelines for classification under non-identifiability
and a consistent measure for the extent of non-identifiability. This allows clas- 
sification error control and assessment evaluation in terms of identifiability. 

The paper begins by reviewing some basic concepts, including the DINA
model and its variants, in Section 2. We introduce the issue of identifiability for
Q-matrix based assessments in Section 3. In Section 4, we explain the use of
equivalence classes and partitions to fully describe the structure of the attribute
profile space in terms of identifiability; an algorithm to generate the partition,
given any Q-matrix, is included. Partitioning allows consistent estimation of
the proportion of individuals in each group of equivalent attribute profiles, as
explained in Section 5. These results are extended to individual attributes via
marginal identifiability in Section 6. In fact, the consistent estimation of the
proportion of the population for which each attribute is marginally identifiable
leads to a reliable measure of exam quality (with respect to identifiability).
We also create a decision rule for respondent classification which controls
misclassification probabilities in Section 7. Section 8 examines the implications
of these methods for several variants of the DINA, in addition to another
Q-matrix based CDM, the DINO model. Finally, results derived from both
simulation and K. K. Tatsuoka's fractions data set are reported in Section 9.



2 Background 

Throughout the paper, we will be using some standard concepts from the study 
of CDMs. Some specific terminology and notations are listed below. 

Attributes are the respondent's (unobserved) mastery of certain skills. If we
suppose that there are $N$ respondents and $K$ attributes, let the matrix of
attributes be $A = (\alpha_{ik})$, where $\alpha_{ik} \in \{0, 1\}$ indicates the presence
or absence of the $k$-th attribute in the $i$-th respondent. An attribute profile
$\alpha = (\alpha_1, \ldots, \alpha_K)^\top$ is the vector of all attributes; an individual respondent
$i$ will have attribute profile $\alpha^i$ such that $\alpha^i_k = \alpha_{ik}$.

Responses are the respondent's binary responses to items. Given $N$
respondents and $J$ items, the responses can be written as an $N \times J$ matrix
$X = (X_{ij})$, where $X_{ij} \in \{0, 1\}$ is the response of the $i$-th respondent
to the $j$-th item. The $i$-th respondent's responses will be denoted by the
vector $X^i$, where the $j$-th element $X^i_j = X_{ij}$ for all $j$.

The Q-matrix is the link between the items and their attribute requirements.
It is a $J \times K$ matrix $Q = (q_{jk})$, where for each $j, k$, $q_{jk} \in \{0, 1\}$ indicates
whether the $j$-th item requires the $k$-th attribute. From the Q-matrix we
can extract the attribute requirements of an item $j$ as the vector $q^j$, where
the $k$-th element $q^j_k = q_{jk}$ for all $j, k$.

2.1 The DINA Model

This paper focuses on the DINA model, one of the most widely used CDMs.
Under the DINA model, given an attribute profile $\alpha$ and a Q-matrix $Q$, we can
further define the quantity

$$\xi_j(Q, \alpha) = \prod_{k=1}^{K} \alpha_k^{q_{jk}},$$

which indicates whether a respondent with attribute profile $\alpha$ possesses all the
attributes required for item $j$. If we suppose no uncertainty in the response, then
a respondent $i$ with attribute profile $\alpha$ will have responses $X_{ij} = \xi_j(Q, \alpha)$ for
$j = 1, \ldots, J$. Thus, the vector $\xi = (\xi_1, \ldots, \xi_J)^\top$ is known as the ideal response
vector.
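
As a minimal sketch of the ideal response defined above, the following Python function evaluates it for one profile. The function name `ideal_response` and the 3-item Q-matrix `Q_example` are illustrative assumptions, not part of the paper.

```python
from typing import Sequence


def ideal_response(Q: Sequence[Sequence[int]], alpha: Sequence[int]) -> list[int]:
    """Return the DINA ideal response vector xi(Q, alpha).

    xi_j is 1 exactly when alpha has every attribute required by row j of Q,
    which is the product of alpha_k ** q_jk over k for 0/1 entries.
    """
    return [int(all(a >= q for a, q in zip(alpha, row))) for row in Q]


# Hypothetical Q-matrix chosen only for illustration (rows are items).
Q_example = [[1, 0, 0], [1, 1, 0], [0, 1, 1]]
xi = ideal_response(Q_example, (1, 0, 1))  # only item 1 is fully covered
```

A respondent with profile (1, 0, 1) lacks Attribute 2, so under this Q-matrix only the first item has an ideal response of 1.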

In the DINA model, uncertainty is incorporated at the item level. With each
item $j = 1, \ldots, J$, we associate a slipping parameter $s_j = P(X_j = 0 \mid \xi_j = 1)$ and
a guessing parameter $g_j = P(X_j = 1 \mid \xi_j = 0)$. Each $X_j$ is Bernoulli with success
probability $(1 - s_j)^{\xi_j} g_j^{1 - \xi_j}$. Thus, the probability of a particular response vector
$x$ given the ideal response vector $\xi$ is

$$P(x \mid \xi, s, g) = \prod_{j=1}^{J} \left[(1 - s_j)^{\xi_j} g_j^{1 - \xi_j}\right]^{x_j} \left[1 - (1 - s_j)^{\xi_j} g_j^{1 - \xi_j}\right]^{1 - x_j}.$$

In addition to $s$ and $g$, the response distribution also depends on $\nu = (\nu_\alpha)_{\alpha \in \{0,1\}^K}$,
the proportion of individuals possessing each attribute profile. Generally,
diagnostic classification is based on the posterior $p(\alpha \mid x)$, which is calculated using
the likelihood above and can be very sensitive to the prior, $\nu$. When $s$, $g$, and $\nu$
are unknown, these parameters must be simultaneously estimated.
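
The response probability above can be sketched in a few lines of Python. The function name `response_probability` and the two-item parameter values are hypothetical and serve only to illustrate the formula.

```python
def response_probability(x, xi, s, g):
    """P(x | xi, s, g) under the DINA model.

    Each item is an independent Bernoulli trial whose success probability
    is 1 - s_j when xi_j = 1 and g_j when xi_j = 0.
    """
    prob = 1.0
    for x_j, xi_j, s_j, g_j in zip(x, xi, s, g):
        p_correct = (1.0 - s_j) if xi_j == 1 else g_j
        prob *= p_correct if x_j == 1 else (1.0 - p_correct)
    return prob


# Hypothetical two-item example: slipping and guessing values are made up.
s_example = [0.1, 0.2]
g_example = [0.2, 0.1]
p = response_probability([1, 0], [1, 0], s_example, g_example)  # 0.9 * 0.9
```

Summing this function over all possible response vectors for a fixed ideal response vector gives one, which is a quick sanity check on any implementation.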



2.2 Variants of the DINA Model

Several variants of the DINA can be constructed by restricting $\nu$ to some
lower-dimensional subspace. For example, assuming independence among the
attributes so that

$$\nu_\alpha = \prod_{k=1}^{K} p(\alpha_k)$$

reduces the $(2^K - 1)$-dimensional parameter space to a $K$-dimensional one. This
restriction is referred to as the independent DINA (ind-DINA) from here on. It
is convenient to model each $\alpha_k$ with a logistic link, so that

$$p(\alpha_k) = \exp(\alpha_k b_k) / [1 + \exp(b_k)],$$

where $b_k$ denotes the attribute's 'difficulty.'
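
To make the independence restriction concrete, here is a small sketch that builds the full vector of profile proportions from the K difficulty parameters. The function name `ind_dina_prior` and the example values of `b` are assumptions made for illustration.

```python
import math
from itertools import product


def ind_dina_prior(b):
    """Profile proportions under attribute independence with a logistic link.

    p(alpha_k = 1) = exp(b_k) / (1 + exp(b_k)); the proportion of a profile
    is the product of its per-attribute probabilities.
    """
    p_master = [math.exp(b_k) / (1.0 + math.exp(b_k)) for b_k in b]
    nu = {}
    for alpha in product((0, 1), repeat=len(b)):
        nu[alpha] = math.prod(
            p if bit == 1 else 1.0 - p for bit, p in zip(alpha, p_master)
        )
    return nu


# Hypothetical difficulties: b_k = 0 gives each attribute a 50% mastery rate.
nu_example = ind_dina_prior([0.0, 0.0])
```

With two attributes and both difficulties at zero, every one of the four profiles receives proportion 0.25.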

Another alternative is the higher-order DINA (HO-DINA) model (de la Torre
& Douglas, 2004; Templin, Henson, Templin, & Roussos, 2008). This model
assumes that the probability of possessing a skill is dependent on a continuous
skill factor $\theta$ following the standard normal distribution, so that

$$\nu_\alpha = p(\alpha) = \int_\theta p(\alpha \mid \theta) \, p(\theta) \, d\theta.$$

Each individual attribute is assumed to be conditionally independent given $\theta$,
so that

$$p(\alpha \mid \theta) = \prod_{k=1}^{K} p(\alpha_k \mid \theta).$$

Finally, the individual probabilities $p(\alpha_k \mid \theta)$ can be modeled with a logistic link,

$$p(\alpha_k \mid \theta) = \exp(\alpha_k (b_k + a_k \theta)) / [1 + \exp(b_k + a_k \theta)],$$

where $b_k$ denotes the attribute's 'difficulty,' and $a_k$ is the attribute
discrimination parameter. It is also possible to fit a restricted version of this model, for
which all the $a_k$ must be equal, as in de la Torre and Douglas (2004). This is
referred to as the restricted higher order DINA (RHO-DINA) model (DeCarlo,
2011).
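
The integral defining the HO-DINA proportions has no closed form, so it is usually approximated numerically. The sketch below uses a plain trapezoidal rule; the function name `ho_dina_prior`, the grid size, and the integration range are all illustrative assumptions rather than choices made in the paper.

```python
import math
from itertools import product


def ho_dina_prior(b, a, n_grid=2001, theta_max=8.0):
    """Approximate nu_alpha = integral of p(alpha | theta) * phi(theta) d theta.

    Integration is a trapezoidal rule on [-theta_max, theta_max], where phi
    is the standard normal density. Grid settings are arbitrary choices.
    """
    step = 2.0 * theta_max / (n_grid - 1)
    profiles = list(product((0, 1), repeat=len(b)))
    nu = dict.fromkeys(profiles, 0.0)
    for i in range(n_grid):
        theta = -theta_max + i * step
        weight = step * math.exp(-0.5 * theta * theta) / math.sqrt(2.0 * math.pi)
        if i in (0, n_grid - 1):
            weight *= 0.5  # trapezoid end points count half
        # Logistic link: probability of mastering each attribute given theta.
        p_master = [
            1.0 / (1.0 + math.exp(-(b_k + a_k * theta))) for b_k, a_k in zip(b, a)
        ]
        for alpha in profiles:
            nu[alpha] += weight * math.prod(
                p if bit else 1.0 - p for bit, p in zip(alpha, p_master)
            )
    return nu


# Hypothetical parameters: equal discriminations, as in the RHO-DINA.
nu_ho = ho_dina_prior([0.0, 0.0], [1.0, 1.0])
```

Setting every discrimination to zero removes the dependence on the skill factor, and the result collapses to the independent case, which gives a simple check on the integration.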



3 Identifiability Issues in the DINA 

Diagnostic assessments are meant to provide detailed information about respon- 
dents' possession of a variety of traits. Preferably, a well-designed exam will be 
able to provide information about each trait for every respondent. However, 
recovering information about the latent variables from a '0' response may be 
difficult; in comparison to a '1' response, which suggests that a respondent is 
more likely to possess each attribute associated with that item, a '0' response 
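may indicate the failure to master any one or several of the required attributes.
Consider the following two simple Q-matrices for the DINA model:

$$Q_1 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad Q_2 = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}.$$
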
In assessments based on the Q-matrix Qi, a correct response to each item gener- 
ally indicates a higher probability that the respondent possesses the correspond- 
ing attribute, while an incorrect response indicates a lower probability of the 
same. However, with Q2, an incorrect response to the second item only implies 
that at least one of the attributes is probably missing. In fact, given that a stu- 
dent does not possess Attribute 1, Item 2 provides no information about his or 
her mastery of Attribute 2, and so respondents with attribute profiles (0, 0) and 
(0, 1) have statistically identical responses. Thus, the assessment as a whole 
is incapable of differentiating between the two profiles, and any classification 
decision between them will solely be a reflection of the prior information. 

A slightly more complicated situation appears if we add a third attribute 
to the example above. Consider an assessment following the DINA model with 
Q-matrix Q3, where 

$$Q_3 = \begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix}. \tag{3}$$

The attribute requirements of the first two items match those of the items
corresponding to $Q_2$. Now, however, the proportion of individuals for whom
Attribute 2 is not identifiable is smaller. Of those who do not possess Attribute 1,
some will possess Attribute 3. Then Attribute 2 is identifiable because of differ- 
ing response distributions on Item 3. However, response distributions for those 
with attribute profiles (0,1,0) and (0,0,0) are still indistinguishable. Thus, al- 
though the assessment provides no information about Attribute 2 for a smaller 
part of the population, the issue has not been completely resolved. 
4 Partitioning the Attribute Profile Space



We begin with an intuitive criterion for deciding whether an assessment has the
ability to differentiate between two attribute profiles.

Definition 1. Two attribute profiles are separable if they lead to different re- 
sponse distributions. 

The differing response distributions of separable attribute profiles imply that 
the data will favor one profile or the other; there is some differential effect on the 
likelihood and thus the posterior. Profiles that are not separable are statistically 
identical, with equivalent likelihood functions, making any differences in their 
posteriors simply artifacts of the prior. 

Determining whether attribute profiles are separable can be done without the full
response distribution; in fact, only the ideal responses $\xi(Q, \alpha)$ are necessary.

Proposition 1. Given a Q-matrix $Q$ and slipping and guessing parameters $s$
and $g$, two attribute profiles $\alpha^1$ and $\alpha^2$ can be separated if and only if they
produce ideal response vectors $\xi^1 = \xi(Q, \alpha^1)$ and $\xi^2 = \xi(Q, \alpha^2)$ such that for
some $j \in \{1, \ldots, J\}$, $\xi^1_j \neq \xi^2_j$ and $1 - s_j \neq g_j$.

Throughout the rest of this paper we assume that $1 - s_j \neq g_j$ for each
$j = 1, \ldots, J$, which simplifies Proposition 1 into Corollary 1. Should an item
with $1 - s_j = g_j$ indeed be present, it has no discriminating power and may be
omitted.

Corollary 1. If every item $j$ has different success probabilities given $\xi_j = 1$ or
given $\xi_j = 0$, i.e., $1 - s_j \neq g_j$ for $j = 1, \ldots, J$, then two attribute profiles can be
separated if and only if they produce different ideal response vectors.

Lastly, it is also of interest whether an attribute profile can be separated
from all other attribute profiles, and is thus identifiable. This definition of
identifiability will be tied to the general statistical concept in Section 5.

Definition 2. An attribute profile $\alpha$ is identifiable when it can be separated
from any other attribute profile $\alpha' \neq \alpha$.



4.1 Complete Separation of Attribute Profiles 



The first step in understanding the identifiability issue is determining under
what circumstances all attribute profiles are identifiable. This depends on the
Q-matrix, which is called complete when it leads to full identifiability (Chiu,
Douglas, & Li, 2009). Formally, we have the following definition:

Definition 3. Under a complete Q-matrix, all attribute profiles are identifiable,
i.e., $\xi(Q, \alpha) \neq \xi(Q, \alpha')$ iff $\alpha \neq \alpha'$.

The requirements for completeness have long been known (K. K. Tatsuoka,
1991; DiBello et al., 1995; Chiu et al., 2009). In essence, the assessment must
contain at least one item devoted solely to each attribute. In terms of the
Q-matrix, this means that for each $k \in \{1, \ldots, K\}$, there should be at least one
row with an entry of '1' solely in the $k$-th position.






Proposition 2. Let $R_Q$ be the set of row vectors of Q-matrix $Q$. Then $Q$ is
complete iff $\{e_k : k = 1, \ldots, K\} \subseteq R_Q$, where $e_k$ is a vector such that the $k$-th
element is one and all other elements are zero.
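
Proposition 2 translates directly into a membership test on the rows of the Q-matrix. The sketch below is one way to write it; the function name `is_complete` and the example matrices are illustrative assumptions.

```python
def is_complete(Q):
    """Return True when every unit vector e_k appears among the rows of Q."""
    K = len(Q[0])
    rows = {tuple(row) for row in Q}
    unit_vectors = (tuple(int(i == k) for i in range(K)) for k in range(K))
    return all(e_k in rows for e_k in unit_vectors)


# Hypothetical examples: an identity Q-matrix is complete, while a matrix
# whose second attribute only appears together with the first is not.
complete_example = is_complete([[1, 0], [0, 1]])
incomplete_example = is_complete([[1, 0], [1, 1]])
```

The check runs over the K unit vectors only, so it stays cheap even when enumerating all attribute profiles would not be.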



4.2 Partial Separation of Attribute Profiles 

While a complete Q-matrix is necessary for full identifiability, many of the
Q-matrices used in practice are unfortunately incomplete. In fact, creating
assessments with complete Q-matrices is oftentimes infeasible, and requiring a
complete matrix for analysis would make models like the DINA model highly
impractical. This makes the partition, a standard mathematical construct, an
essential tool for accurately and systematically describing the structure of
nonidentifiability in the DINA. Partitions are formed from equivalence relations, which
have the following requirements:

Definition 4. The relation '$a \sim b$' is an equivalence relation if it is

• reflexive: $a \sim a$

• symmetric: $a \sim b$ iff $b \sim a$

• transitive: if $a \sim b$ and $b \sim c$, then $a \sim c$.

Proposition 3. Let '$\sim$' denote the binary relation 'cannot be separated,' where
$\alpha^1 \sim \alpha^2$ if and only if $\xi(Q, \alpha^1) = \xi(Q, \alpha^2)$. Then '$\sim$' is an equivalence relation.

Putting profiles into groups, known as equivalence classes, based on an equiv- 
alence relation results in a partition; in this case, any two attribute profiles in 
the same equivalence class cannot be separated, while any two in different classes 
can be. We denote a particular equivalence class by [a], where a may be any 
attribute profile in the class; literally, [a] can be read as "the set of attribute 
profiles equivalent to a." 

The simplest way of determining the partition would be to calculate the ideal
response vector of each of the $2^K$ attribute profiles and sort them
lexicographically. This runs in $O(JK \cdot 2^K)$ time (refer to Table 1 for the step-by-step
algorithm). For an alternative algorithm using Boolean algebra, see K. K.
Tatsuoka (1991). Note that our algorithm results in equivalence classes labeled by
their smallest member, which shall be called the minimal representative. The
minimal representative has additional meaning as the attribute requirements
of the corresponding ideal response vector and is therefore the most preferable
member for labeling.

As seen in Table 2, performing the algorithm on the 3 x 3 Q-matrix $Q_3$
from (3) results in five different equivalence classes, each of which is labeled
by its minimal representative: [000] = {000, 010, 001}, [011] = {011},
[100] = {100, 101}, [110] = {110}, and [111] = {111}. Note that since the
bracket notation may be read as 'the equivalence class containing,' it is possible
to change the labeling of each equivalence class by choosing any other member
as the titular profile: [000], [010], and [001] all refer to the same equivalence
class, for example.

Table 1: Algorithm for partitioning an attribute profile space

Step      Procedure
Input:    A $J \times K$ Q-matrix $Q$.
(0)       (optional) Remove items with duplicate attribute requirements.
(1)       List all $2^K$ attribute profiles $\alpha$.
(2)       Find the ideal response vector $\xi(Q, \alpha)$ for each $\alpha$.
(3)       Do a lexicographic (alphabetic) sort of the ideal response vectors.
(4)       Check whether each successive profile has the same ideal response
          vector as the previous profile. If so, $\alpha$ is part of the current
          equivalence class. Else, $\alpha$ is the first member of a new
          equivalence class $[\alpha]$.
Output:   A list of equivalence classes $[\alpha]$ and their members.


Table 2: Generating the partition associated with the Q-matrix $Q_3$

Steps from Table 1 are labeled (1), (2), (3), and (4).

(1) Profile   (2) Ideal response   (3) Sorted profile   (3) Sorted ideal response   (4) Class
000           000                  000                  000                         [000]
100           100                  010                  000
010           000                  001                  000
110           110                  011                  001                         [011]
001           000                  100                  100                         [100]
101           100                  101                  100
011           001                  110                  110                         [110]
111           111                  111                  111                         [111]
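
The partitioning procedure of Table 1 can be sketched in Python as follows. This version groups profiles with a dictionary keyed by ideal response vector instead of performing the lexicographic sort, which yields the same classes; the function name `partition_profiles` is an assumption made for illustration.

```python
from itertools import product


def partition_profiles(Q):
    """Group all 2^K attribute profiles by their DINA ideal response vector.

    Returns a dict mapping each class's minimal representative (the member
    with the fewest mastered attributes) to the list of class members.
    """
    K = len(Q[0])
    by_xi = {}
    for alpha in product((0, 1), repeat=K):
        xi = tuple(int(all(a >= q for a, q in zip(alpha, row))) for row in Q)
        by_xi.setdefault(xi, []).append(alpha)
    return {min(members, key=sum): members for members in by_xi.values()}


# The Q-matrix Q_3 from the text: items require {1}, {1, 2}, and {2, 3}.
Q3 = [[1, 0, 0], [1, 1, 0], [0, 1, 1]]
classes = partition_profiles(Q3)
```

For this Q-matrix the result has five classes, matching the partition listed in the text, with [000] holding three profiles and [100] holding two.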



5 Consistent Estimation 

We now consider the problem of parameter estimation, specifically that of
$\nu_\alpha$, the proportion of the population possessing each attribute profile $\alpha$. Unless $\nu$
is assumed known, its consistent estimation has important consequences for
respondent classification and exam validity. Unfortunately, when an assessment's
Q-matrix is incomplete, it is impossible to consistently estimate $\nu$. For each
equivalence class $[\alpha]$, let $\nu_{[\alpha]}$ be the proportion of the population possessing an
attribute profile within that equivalence class. Then,

$$\nu_{[\alpha]} = \sum_{\alpha' \in [\alpha]} \nu_{\alpha'}. \tag{4}$$

The probability of observing any particular set of data depends only on the $\nu_{[\alpha]}$,
since the probability of any response depends only on equivalence class
membership, not on the respondent's possession of a specific profile. With an
incomplete Q-matrix it is possible to observe populations with different distributions
$\nu^1 \neq \nu^2$ over the attribute profile space that have identical distributions over
the equivalence classes $[\alpha]$, i.e., $\nu^1_{[\alpha]} = \nu^2_{[\alpha]}$ for all $\alpha \in \{0, 1\}^K$, and thus
identical response distributions. The phenomenon where different parameter values
lead to identical response distributions is generally known as non-identifiability,
and it destroys the ability of likelihood-based estimation methods to achieve
consistency.

While consistent estimation of $\nu_\alpha$ cannot be achieved, it is possible to
consistently estimate the proportion of individuals within each equivalence class
$[\alpha]$.

Theorem 1. Suppose an assessment follows the DINA model, with known
Q-matrix $Q$ and item parameters $s$ and $g$. Let $\nu_{[\alpha]}$, representing the proportion
of the population possessing an attribute profile $\alpha' \in [\alpha]$, be defined as in (4),
and let the population parameter $\nu$ be the vector of all $\nu_{[\alpha]}$. We may write its
likelihood as

$$L(\nu) = p(X \mid \nu) = \prod_{i=1}^{N} p(X^i \mid \nu) = \prod_{i=1}^{N} \sum_{[\alpha]} p(X^i \mid [\alpha]) \, \nu_{[\alpha]}.$$

Then the maximum likelihood estimate $\hat{\nu}$ of $\nu$ is consistent as $N \to \infty$.

Consistent estimation of the $\nu_{[\alpha]}$ is an important result, justifying the results
of the following sections. To emphasize the differences in parameter space and
procedure, work based on equivalence classes $[\alpha]$ rather than profiles $\alpha$ will from
here on be referred to under the name of the Non-Identifiability ADjusted DINA
(NIAD-DINA) model.
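
One common way to maximize a mixture likelihood of the form in Theorem 1 is the E-M algorithm. The sketch below assumes, as the theorem does, that the item parameters are known, and it updates only the class proportions; the function name `em_class_proportions`, the uniform starting point, and the fixed iteration count are illustrative assumptions.

```python
def em_class_proportions(X, class_xis, s, g, n_iter=500):
    """E-M estimate of the class proportions with s and g treated as known.

    X is a list of 0/1 response vectors; class_xis holds one ideal response
    vector per equivalence class. Returns one proportion per class.
    """
    def likelihood(x, xi):
        p = 1.0
        for x_j, xi_j, s_j, g_j in zip(x, xi, s, g):
            p_correct = 1.0 - s_j if xi_j else g_j
            p *= p_correct if x_j else 1.0 - p_correct
        return p

    # N x C table of p(x_i | class c); it never changes during the iterations.
    table = [[likelihood(x, xi) for xi in class_xis] for x in X]
    n_classes = len(class_xis)
    nu = [1.0 / n_classes] * n_classes  # uniform start (arbitrary choice)
    for _ in range(n_iter):
        totals = [0.0] * n_classes
        for row in table:
            denom = sum(l * v for l, v in zip(row, nu))
            for c in range(n_classes):
                totals[c] += row[c] * nu[c] / denom  # posterior class weight
        nu = [t / len(X) for t in totals]
    return nu


# Hypothetical noise-free data: with s = g = 0 each response identifies
# its class exactly, so the estimate equals the empirical class frequencies.
xis_example = [(0, 0), (1, 0), (1, 1)]
X_example = [(0, 0)] * 2 + [(1, 0)] * 3 + [(1, 1)] * 5
nu_hat = em_class_proportions(X_example, xis_example, [0.0, 0.0], [0.0, 0.0])
```

Because the classes have distinct ideal response vectors, this likelihood is a function of the class proportions alone, which is what makes the estimate well defined even when the profile-level proportions are not.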

6 Marginal Identifiability 

We now wish to extend the concept of identifiability to individual attributes. 
This is motivated by the fact that, though the presence of multiple profiles in 
the same equivalence class signals non-identifiability, some individual attributes 
may still be identifiable within the class. To illustrate, consider the Q-matrix
$Q_3$ from (3) and one of its equivalence classes, [000] = {000, 010, 001}. If
a profile $\alpha \in [000]$, then its first component $\alpha_1 = 0$, but the values of $\alpha_2$ and $\alpha_3$
are uncertain. Thus, posterior weight $p([000] \mid x)$ on this class counts as positive
evidence that $\alpha_1 = 0$, but does not help in deciding $\alpha_2$ or $\alpha_3$. This observation
motivates the following definition:

Definition 5. An attribute is marginally identifiable within an equivalence class 
when either all members of that class possess that attribute or none of them do. 

Define the marginal identifiability indicator $\delta_{[\alpha],k}$ as follows:

$$\delta_{[\alpha],k} = \prod_{\alpha' \in [\alpha]} \alpha'_k + \prod_{\alpha' \in [\alpha]} (1 - \alpha'_k). \tag{5}$$

Then $\delta_{[\alpha],k} = 1$ when Attribute $k$ is marginally identifiable within equivalence
class $[\alpha]$. Posterior weight on a class $[\alpha]$ only provides information about the
$k$-th attribute when $\delta_{[\alpha],k} = 1$; otherwise, there is no information beyond the
prior.
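
The indicator in Definition 5 can be computed by checking whether all members of a class agree on an attribute. A minimal sketch follows; the function name `marginal_identifiability` is an assumption.

```python
def marginal_identifiability(members):
    """Return the vector of indicators delta_k for one equivalence class.

    delta_k is 1 when every profile in the class has the same value for
    attribute k (all possess it or none do), and 0 otherwise.
    """
    K = len(members[0])
    return [int(len({alpha[k] for alpha in members}) == 1) for k in range(K)]


# The class [000] = {000, 010, 001} discussed in the text.
delta_000 = marginal_identifiability([(0, 0, 0), (0, 1, 0), (0, 0, 1)])
```

For the class [000], only Attribute 1 is constant across members, so the indicator vector is (1, 0, 0).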

6.1 Exam Quality: the Marginal Identifiability Rate 

Since non-identifiability is frequently unavoidable with Q-matrix based CDMs,
it is important to measure its extent. For a more nuanced view, this is done on
a marginal basis.

Given the proportion $\nu_\alpha$ of each attribute profile $\alpha$, the proportion of the
population for which the $k$-th attribute is marginally identifiable can be
quantified by $\zeta_k$, as follows:

$$\zeta_k = \sum_{\{\alpha : \delta_{[\alpha],k} = 1\}} \nu_\alpha. \tag{6}$$

Let $\zeta$ be the vector of all $\zeta_k$. Then $\zeta$ is the proportion of the population for
which each attribute is marginally identifiable, i.e., the marginal identifiability
rate.

Oftentimes $\nu_\alpha$, and thus $\zeta$, is unknown. Under the conditions of Theorem 1,
$\zeta$ can be consistently estimated by its maximum likelihood estimator $\hat{\zeta}$.

Proposition 4. Suppose an assessment follows the DINA model, with known
Q-matrix $Q$ and item parameters $s$ and $g$. Let $\hat{\nu}_{[\alpha]}$ be the MLE of $\nu_{[\alpha]}$.
Then

$$\hat{\zeta}_k = \sum_{\{[\alpha] : \delta_{[\alpha],k} = 1\}} \hat{\nu}_{[\alpha]}, \qquad k = 1, \ldots, K, \tag{7}$$

is consistent as $N \to \infty$.

The consistency of $\hat{\zeta}$ is a direct consequence of the consistency of $\hat{\nu}$ in
Theorem 1. We thus obtain a very reasonable measure of exam quality, in
terms of the proportion of the population for which each attribute is marginally
identifiable.
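
As a small worked example of the marginal identifiability rate, the sketch below sums class proportions over the classes in which each attribute is marginally identifiable. The function name `marginal_identifiability_rate` is an assumption; the inputs are the rounded true class proportions and indicator vectors reported for the simulated example in Table 7.

```python
def marginal_identifiability_rate(class_props, class_deltas):
    """Return zeta_k for each attribute k.

    zeta_k is the total proportion of the population falling in classes
    where attribute k is marginally identifiable (delta_k = 1).
    """
    K = len(next(iter(class_deltas.values())))
    return [
        sum(p for label, p in class_props.items() if class_deltas[label][k] == 1)
        for k in range(K)
    ]


# Rounded values taken from Table 7 (true proportions and delta vectors).
props = {"000": 0.29, "100": 0.26, "011": 0.04, "110": 0.20, "111": 0.21}
deltas = {
    "000": (1, 0, 0),
    "100": (1, 1, 0),
    "011": (1, 1, 1),
    "110": (1, 1, 1),
    "111": (1, 1, 1),
}
zeta = marginal_identifiability_rate(props, deltas)
```

With these rounded inputs the rates come out to roughly 1.00, 0.71, and 0.45 for Attributes 1, 2, and 3, which shows Attribute 3 to be the least well measured.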

7 Classification 

Non-identifiability has potentially serious effects on respondent classification.
Classification is generally conducted based on the posterior distribution
$p(\alpha \mid x) \propto p(x \mid \alpha) \, p(\alpha)$. Recall that profiles in the same equivalence class
have the same likelihood. Thus, within a class, the posterior will simply be a
reflection of the prior, without any added information from the data (DeCarlo,
2011). In fact, within an equivalence class the posteriors are proportional to the
priors. Given a prior $p(\alpha)$, for any $\alpha^1, \alpha^2 \in [\alpha]$,

$$\frac{p(\alpha^1 \mid x)}{p(\alpha^2 \mid x)} = \frac{p(x \mid \alpha^1) \, p(\alpha^1)}{p(x \mid \alpha^2) \, p(\alpha^2)} = \frac{p(x \mid [\alpha]) \, p(\alpha^1)}{p(x \mid [\alpha]) \, p(\alpha^2)} = \frac{p(\alpha^1)}{p(\alpha^2)}.$$

Posteriors are often calculated by maximizing the marginal likelihood
$L(\nu, s, g)$ via the E-M algorithm (Haertel, 1989; de la Torre, 2009; Rupp et al.,
2010). Then, since all vectors $\nu$ with identical weights on each class have
identical likelihoods, any $\nu$ achieving the maximizing $\nu_{[\alpha]}$ may result. The
values chosen are determined by the starting values, which have little validity
for classification.

When the posterior is sensitive to the prior, it is important to work with
$p([\alpha])$, which can be estimated consistently, rather than $p(\alpha)$. Thus classification
here will be conducted based on $p([\alpha] \mid x) \propto p(x \mid [\alpha]) \, p([\alpha])$ instead of the
usual posterior. This calculation does not require a separate fitting of the model,
since

$$p([\alpha] \mid x) = \sum_{\alpha' \in [\alpha]} p(\alpha' \mid x).$$

From this posterior, we then define

$$p_k^{\min}(x) = \sum_{[\alpha] : \delta_{[\alpha],k} = 1} \alpha_k \, p([\alpha] \mid x), \qquad p_k^{\max}(x) = p_k^{\min}(x) + \sum_{[\alpha] : \delta_{[\alpha],k} = 0} p([\alpha] \mid x),$$

where $\delta_{[\alpha],k}$ is the marginal identifiability indicator defined in (5).
Classification follows from the fact that, depending upon the specific hyperprior
on $\nu$ or starting point of the E-M algorithm, the DINA model may produce
marginal posterior probabilities of mastery $p(\alpha_k = 1 \mid x)$ anywhere in the range
$[p_k^{\min}(x), p_k^{\max}(x)]$. Thus, it is only appropriate to conclude that $\alpha_k = 1$ when
$p_k^{\min}(x)$ is high, or that $\alpha_k = 0$ when $p_k^{\max}(x)$ is low. A natural cutoff for both
is 0.5, but it may be adjusted as necessary. This classification method, from
here on referred to as the NIAD-DINA classification algorithm, accounts for both
uncertainty in the prior and uncertainty caused by slipping and guessing. It is
summarized in Table 3.

Table 3: NIAD-DINA classification algorithm

Step      Procedure                                                        q.v.
Input:    Q-matrix $Q = (q_{jk})_{J \times K}$, data $X = (x_{ij})_{N \times J}$.
(1)       Fit the model to produce $p(\alpha \mid x)$.
(2)       Partition the attribute profile space.                           Table 1
(3)       Calculate the marginal identifiability vector $\delta_{[\alpha]}$.        (5)
(4)       Sum posteriors $p(\alpha \mid x)$ for $p([\alpha] \mid x)$.
(5)       Calculate $p_k^{\min}(x)$ and $p_k^{\max}(x)$ for every $k$, $x$.
(6)       Classify:
            If $p_k^{\min}(x) > 0.5$, then $\alpha_k = 1$.
            If $p_k^{\max}(x) < 0.5$, then $\alpha_k = 0$.
            Else, $\alpha_k = *$ (unclassified).
Output:   Classifications $\alpha^i_k \in \{0, 1, *\}$ for all $i$, $k$.
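
Steps (5) and (6) of the classification algorithm can be sketched as below. The lower bound counts posterior weight on classes where the attribute is identifiable and present; the upper bound adds all weight on classes where the attribute is not identifiable. The function name `classify` and the posterior values in the example are hypothetical.

```python
def classify(class_posteriors, class_reps, class_deltas, cutoff=0.5):
    """Return one label per attribute: 1, 0, or '*' (unclassified).

    class_posteriors[c] is p([alpha] | x) for class c, class_reps[c] is any
    member of class c, and class_deltas[c] is its indicator vector.
    """
    K = len(class_reps[0])
    labels = []
    for k in range(K):
        p_min = sum(
            p
            for p, rep, d in zip(class_posteriors, class_reps, class_deltas)
            if d[k] == 1 and rep[k] == 1
        )
        p_max = p_min + sum(
            p for p, d in zip(class_posteriors, class_deltas) if d[k] == 0
        )
        if p_min > cutoff:
            labels.append(1)
        elif p_max < cutoff:
            labels.append(0)
        else:
            labels.append("*")
    return labels


# Classes of Q_3 with hypothetical posterior weights for one respondent.
reps = [(0, 0, 0), (0, 1, 1), (1, 0, 0), (1, 1, 0), (1, 1, 1)]
deltas = [(1, 0, 0), (1, 1, 1), (1, 1, 0), (1, 1, 1), (1, 1, 1)]
labels = classify([0.97, 0.0, 0.03, 0.0, 0.0], reps, deltas)
```

In this example nearly all posterior weight sits on [000], so Attribute 1 is classified as absent while Attributes 2 and 3 are left unclassified, since the data cannot separate the members of that class.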





8 Extensions 

8.1 Variants of the DINA

The methodology of partitioning in Section 4 applies to any model where the
presence of a difference in the ideal response pattern fully determines the
presence of a difference in the likelihood function. Included among these models
are all variants of the DINA model listed in Section 2.2. Since these variants
are, in essence, a restriction on the space of $\nu$, the consistency result for $\nu_{[\alpha]}$
in Section 5, along with all the following results, holds when the model is in
fact correct. However, if the true $\nu_{[\alpha]}$ do not fall under the set of values
consistent with the restriction on the parameter space, then even estimates of $\nu_{[\alpha]}$
will no longer be consistent. Thus, large differences in the $\nu_{[\alpha]}$ calculated
under restricted models from those calculated under the NIAD-DINA model are
symptomatic of model misspecification, and may imply that the DINA variant
chosen is overly restrictive on the prior. Goodness-of-fit measures such as the
AIC and BIC will reflect lack of fit appropriately if the saturated model has the
correct number of parameters $2J + L$, rather than $2J + 2^K - 1$.

8.2 The DINO Model 

The DINO model also specifies item and attribute relationships using a
Q-matrix, but the ideal responses are calculated as

$$\xi_j(Q, \alpha) = 1 - \prod_{k=1}^{K} (1 - \alpha_k)^{q_{jk}} = \mathbf{1}(\alpha_k = q_{jk} = 1 \text{ for some } k).$$

As in the DINA model, the response probabilities are functions of item
parameters $s_j = P(X_j = 0 \mid \xi_j = 1)$ and $g_j = P(X_j = 1 \mid \xi_j = 0)$. Under the
DINA model, ideal responses are correct when the respondent possesses all
required attributes; under the DINO model, ideal responses are incorrect when
the respondent possesses none of the required attributes. Thus, for responses $X$
following the DINO model, the reverse responses $1 - X$ follow the DINA model
(with a reversed interpretation of the attribute profile vectors). This duality
implies that all the results of this paper apply to the DINO model.
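
The relationship between the two models can be checked numerically at the level of ideal responses: the DINO ideal response for a profile equals one minus the DINA ideal response for the complemented profile. The sketch below verifies this over all profiles; the function names are assumptions.

```python
from itertools import product


def dina_ideal(Q, alpha):
    """DINA: 1 when alpha has all attributes required by the item."""
    return [int(all(a >= q for a, q in zip(alpha, row))) for row in Q]


def dino_ideal(Q, alpha):
    """DINO: 1 when alpha has at least one attribute required by the item."""
    return [int(any(a == 1 and q == 1 for a, q in zip(alpha, row))) for row in Q]


def duality_holds(Q):
    """Check xi_DINO(Q, alpha) == 1 - xi_DINA(Q, 1 - alpha) for every alpha."""
    K = len(Q[0])
    for alpha in product((0, 1), repeat=K):
        flipped = tuple(1 - a for a in alpha)
        if dino_ideal(Q, alpha) != [1 - v for v in dina_ideal(Q, flipped)]:
            return False
    return True


# Q_3 from the text; any 0/1 Q-matrix should satisfy the same identity.
holds_for_q3 = duality_holds([[1, 0, 0], [1, 1, 0], [0, 1, 1]])
```

Since slipping and guessing act on the ideal response in the same way in both models, this identity is what allows the partition and classification results to carry over.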

9 Results 

9.1 Simulation Results 

We first demonstrate the procedures on simulated data. Responses are generated
for $N = 5000$ respondents taking an assessment with $J = 6$ items measuring
$K = 3$ distinct attributes. The respondents' mastery or nonmastery of the
measured attributes is randomly generated according to the probability $p_{sim}(\alpha)$ of
each profile $\alpha \in \{0, 1\}^K$, as listed in Table 4. The responses themselves follow
the DINA model according to the Q-matrix $Q_{sim}$ with slipping $s_{sim}$ and
guessing $g_{sim}$ as shown in Table 5.

Table 4: Population proportions of each attribute profile

$\alpha$      000   001   010   011   100   101   110   111
$p_{sim}$     0.27  0.00  0.01  0.04  0.10  0.16  0.20  0.21



Table 5: Q-matrix, slipping, and guessing for simulated data.

Item ($j$)   Attribute vector ($q^j$)   Slipping ($s_j$)   Guessing ($g_j$)
1            100                        0.14               0.10
2            110                        0.12               0.15
3            011                        0.18               0.18
4            100                        0.17               0.18
5            110                        0.08               0.06
6            011                        0.05               0.06
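
A data set of this kind can be generated with a short simulation. The sketch below uses the rounded values printed in Tables 4 and 5; the function name `simulate_dina`, the seed, and the use of Python's `random` module are assumptions, and the output will not reproduce the paper's actual simulated data.

```python
import random


def simulate_dina(n, profiles, probs, Q, s, g, seed=0):
    """Draw n attribute profiles and DINA responses.

    Profiles are sampled with the given weights; each response is correct
    with probability 1 - s_j when the ideal response is 1, else g_j.
    """
    rng = random.Random(seed)
    alphas = rng.choices(profiles, weights=probs, k=n)
    X = []
    for alpha in alphas:
        row = []
        for q, s_j, g_j in zip(Q, s, g):
            xi = all(a >= r for a, r in zip(alpha, q))
            p_correct = 1.0 - s_j if xi else g_j
            row.append(int(rng.random() < p_correct))
        X.append(row)
    return alphas, X


# Rounded values from Tables 4 and 5 (the printed proportions sum to 0.99,
# which random.choices tolerates because it normalizes the weights).
profiles = [(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1),
            (1, 0, 0), (1, 0, 1), (1, 1, 0), (1, 1, 1)]
probs = [0.27, 0.00, 0.01, 0.04, 0.10, 0.16, 0.20, 0.21]
Q_sim = [[1, 0, 0], [1, 1, 0], [0, 1, 1], [1, 0, 0], [1, 1, 0], [0, 1, 1]]
s_sim = [0.14, 0.12, 0.18, 0.17, 0.08, 0.05]
g_sim = [0.10, 0.15, 0.18, 0.18, 0.06, 0.06]
alphas, X = simulate_dina(5000, profiles, probs, Q_sim, s_sim, g_sim)
```

Fixing the seed makes the draw repeatable, which is convenient when comparing several fitting runs on the same simulated responses.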



The Q-matrix $Q_{sim}$ is incomplete, and the resulting instability in the
posterior becomes clear once the data is fitted multiple times. As an example, the
posterior probabilities of each attribute profile given the zero response vector
$x = (0, 0, \ldots, 0)$ are summarized in Table 6. Here, the DINA, HO-DINA, and
RHO-DINA are overparameterized and produce a wide range of results for
Profiles 000, 001, and 010. The slight variability in the ind-DINA estimates is a
numerical artifact. While the ind-DINA does not suffer from nonidentifiability,
it still does not give accurate estimates in this case since the model assumptions
are incorrect.

Table 6: Posterior probabilities given zero correct responses, $p(\alpha \mid x = 0)$

                                  $\alpha$
             000   001   010   011   100   101   110   111
truth        0.91  0.02  0.05  0.00  0.01  0.01  0.00  0.00
minimums
  DINA       0.01  0.02  0.03  0.00  0.00  0.00  0.00  0.00
  HO-DINA    0.13  0.09  0.03  0.00  0.01  0.01  0.00  0.00
  RHO-DINA   0.55  0.11  0.29  0.00  0.02  0.01  0.00  0.00
  ind-DINA   0.29  0.24  0.37  0.00  0.02  0.02  0.00  0.00
maximums
  DINA       0.71  0.86  0.56  0.00  0.02  0.03  0.00  0.00
  HO-DINA    0.62  0.81  0.71  0.00  0.02  0.02  0.00  0.00
  RHO-DINA   0.58  0.13  0.30  0.00  0.02  0.01  0.00  0.00
  ind-DINA   0.31  0.28  0.40  0.01  0.03  0.02  0.00  0.00

Note: Minimum and maximum values of the posterior $p(\alpha \mid x = 0)$, as
generated over ten runs of the (random start) E-M algorithm.

Partitioning the attribute profile space as directed by Table 1 produces the
five equivalence classes listed in Table 7, two of which have multiple members.
The table also reports the marginal identifiability vector $\delta_{[\alpha]}$
for each class. Note that since Items 1 and 4 are devoted to Attribute 1, it is
always marginally identifiable and $\delta_{[\alpha],1} = 1$. It is also clear
that nonidentifiability most seriously affects Attribute 3, which is marginally
non-identifiable for members of both [000] and [100]. Finally, Table 7 also
reports E-M estimates of the proportion of respondents in each class under the
DINA and several variants, along with the true proportion. Note the accuracy of
the DINA estimates, which are consistent, and the inaccuracy of the ind-DINA
estimates due to model misfit.

Table 7: Equivalence classes, along with their class sizes, true and maximum
likelihood probabilities, and marginal identifiability vectors.

[$\alpha$]   Size   True   DINA   HO-DINA   RHO-DINA   ind-DINA   $\delta_{[\alpha]}$
[000]        3      0.29   0.30   0.29      0.29       0.22       100
[100]        2      0.26   0.26   0.27      0.26       0.31       110
[011]        1      0.04   0.04   0.04      0.04       0.08       111
[110]        1      0.20   0.20   0.19      0.20       0.20       111
[111]        1      0.21   0.21   0.21      0.21       0.18       111

We now consider variability in the marginal posterior probabilities
$p(\alpha_k = 1 \mid x)$. Table 8 gives the sample range of
$p(\alpha_k = 1 \mid x)$ after ten runs of the E-M algorithm, in addition to the
theoretical range. Note the large theoretical ranges for
$p(\alpha_k = 1 \mid x = 0)$, $k = 2, 3$.

Table 8: Variability in $p(\alpha_k = 1 \mid x = 0)$, the marginal posterior
given the zero response vector.

                           k
                    1      2      3
sample min          0.03   0.03   0.04
theoretical min     0.03   0.00   0.00
sample max          0.03   0.56   0.89
theoretical max     0.03   0.97   1.00

Note: Probabilities calculated by fitting the DINA model over ten runs of the
E-M algorithm with random starts.
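
The theoretical range in Table 8 can be illustrated with a short sketch. It assumes that the posterior mass of each equivalence class is fixed by the data while its split among the members of the class is arbitrary; the class memberships below are inferred from Table 7, and the posterior values are the rounded "truth" row of Table 6, so the results agree with Table 8 only up to rounding (about 0.01). The function name `marginal_range` is hypothetical.

```python
# Equivalence classes of the simulated example (memberships inferred from Table 7).
classes = {
    "000": ["000", "001", "010"],
    "100": ["100", "101"],
    "011": ["011"],
    "110": ["110"],
    "111": ["111"],
}

# True posterior p(alpha | x = 0), copied from the "truth" row of Table 6 (rounded).
posterior = {
    "000": 0.91, "001": 0.02, "010": 0.05, "011": 0.00,
    "100": 0.01, "101": 0.01, "110": 0.00, "111": 0.00,
}


def marginal_range(k, classes, posterior):
    """Return (min, max) of p(alpha_k = 1 | x) over within-class reallocations.

    k is a 0-based attribute index. Each class keeps its total posterior mass,
    but that mass may be moved freely among the members of the class.
    """
    low = high = 0.0
    for members in classes.values():
        mass = sum(posterior[a] for a in members)
        has_k = [a[k] == "1" for a in members]
        if all(has_k):   # every member has attribute k: the mass always counts
            low += mass
        if any(has_k):   # some member has attribute k: the mass can count
            high += mass
    return low, high


ranges = [marginal_range(k, classes, posterior) for k in range(3)]
for k, (low, high) in enumerate(ranges, start=1):
    print(f"k={k}: min={low:.2f}, max={high:.2f}")
```

Attribute 1 has no room to vary because every class agrees on it, whereas nearly all of the posterior mass for Attributes 2 and 3 sits in classes whose members disagree on those attributes.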

Classification was conducted on a marginal basis, based on
$p(\alpha_k = 1 \mid x)$, under each of the models. In addition, NIAD-DINA
classification was performed (see Table 3). Marginal misclassification rates
$p(\hat{\alpha}_k \neq \alpha_k)$ are compared in Table 9. Note that NIAD-DINA
classification results in unclassified individuals; for example,
$\hat{\alpha} = (0\ {*}\ {*})$ for those with the zero response vector. The
proportion left unclassified is also listed in Table 9. The DINA and HO-DINA are
overparameterized and the misclassification rate for Attribute 3 may reach over
40% in both models. The ind-DINA also performs poorly, but due to an overly
restricted parameter space rather than nonidentifiability. Adjusting
classification under the DINA to account for nonidentifiability according to the
method described in Section 7 solves both these issues. It may leave a large
proportion of individuals unclassified, but this is a necessary consequence of
the assessment design. Classifying these individuals would require further
assumptions beyond the model.



Table 9: Marginal misclassification rates under a variety of models.

                                   Model
k    DINA        HO-DINA     RHO-DINA   ind-DINA   NIAD-DINA
1    0.07        0.07        0.07       0.09       0.07 (0.00)
2    0.07-0.32   0.07-0.32   0.07       0.26       0.04 (0.32)
3    0.19-0.44   0.19-0.43   0.20       0.21       0.04 (0.56)

Note: Range over 10 runs reported for overparameterized models. All cut-offs
equal to 0.5. The proportion of respondents left unclassified under the
NIAD-DINA is displayed within parentheses.

In addition to controlling misclassification errors, we may also evaluate the
quality of the assessment by measuring the marginal identifiability rate
$\zeta$ defined in (6). Table 10 shows both true and estimated values for
$\zeta$. Note once again that nonidentifiability affects Attribute 3 more
severely than it does Attribute 2. In addition, estimates are generally
accurate, except in the case of the ind-DINA, which suffers from lack of fit.

Table 10: True and estimated values for $\zeta$, marginal identifiability rate.

                          Model
k    true   DINA   HO-DINA   RHO-DINA   ind-DINA
1    1.00   1.00   1.00      1.00       1.00
2    0.71   0.70   0.71      0.71       0.78
3    0.44   0.45   0.45      0.45       0.47

In terms of model selection, reducing the number of parameters for the DINA
model to $2M + L$ from the original $2M + 2^K$ reduces the comparative advantage
of the restricted models. In Table 11, the AIC value of the RHO-DINA barely
edges out that of the DINA with identifiability adjustment.

Table 11: AIC and BIC for the DINA, RHO-DINA, and ind-DINA.

             parameters   AIC       BIC
NIAD-DINA    17           32862.8   32973.6
RHO-DINA     16           32861.2   32965.5
ind-DINA     15           32995.9   33093.6



9.2 A Fractions Data Example 

We now turn to the widely analyzed fraction subtraction data set of K. K.
Tatsuoka (1990). It is composed of the twenty items listed in Table 12. The
Q-matrix in Table 13 comes from de la Torre and Douglas (2004), and specifies
the following eight attributes: $\alpha_1$ = convert a whole number to a
fraction; $\alpha_2$ = separate a whole number from a fraction; $\alpha_3$ =
simplify before subtracting; $\alpha_4$ = find a common denominator; $\alpha_5$
= borrow from whole number part; $\alpha_6$ = column borrow to subtract the
second numerator from the first; $\alpha_7$ = subtract numerators; and
$\alpha_8$ = reduce answers to simplest form.

Table 12: Items from the fraction subtraction data set (K. K. Tatsuoka, 1990).

No.   Item               No.   Item               No.   Item
1.    5/3 - 3/4          8.    2/3 - 2/3          15.   2 - 1/3
2.    3/4 - 3/8          9.    3 7/8 - 2          16.   4 5/7 - 1 4/7
3.    5/6 - 1/9          10.   4 4/12 - 2 7/12    17.   7 3/5 - 4/5
4.    3 1/2 - 2 3/2      11.   4 1/3 - 2 4/3      18.   4 1/10 - 2 8/10
5.    4 3/5 - 3 4/10     12.   11/8 - 1/8         19.   4 - 1 4/3
6.    6/7 - 4/7          13.   3 3/8 - 2 5/6      20.   4 1/3 - 1 5/3
7.    3 - 2 1/5          14.   3 4/5 - 3 2/5



Table 13: Q-matrix from de la Torre and Douglas (2004).

Item   Attributes ($q_j$)    Item   Attributes ($q_j$)    Item   Attributes ($q_j$)
1.     00010110              8.     00000010              15.    10000010
2.     00010010              9.     01000000              16.    01000010
3.     00010010              10.    01001011              17.    01001010
4.     01101010              11.    01001010              18.    01001110
5.     01010011              12.    00000011              19.    11101010
6.     00000010              13.    01011010              20.    01101010
7.     11000010              14.    01000010

As pointed out by DeCarlo (2011), this assessment exemplifies the
identifiability issues of the DINA model. While Attributes 2 and 7 have items
dedicated solely to them, all other attributes appear only in combination. In
fact, Attribute 3 appears only in Items 4, 19, and 20, always in conjunction
with Attributes 2, 5, and 7. Attribute 7 is required for all items except one,
making it difficult to draw conclusions about other attributes when it has not
been mastered. Table 14 displays the marginal posterior probabilities of mastery
for each attribute, given the zero response vector. The posterior displayed for
the DINA is just one possible output of the E-M algorithm for this data;
meanwhile, note the high probabilities of mastery under the ind-DINA model.
Common sense dictates that something is out of place when the analysis states
that students with a score of zero cannot subtract numerators, but can do
everything else, from finding a common denominator to borrowing to reducing to
simplest form.
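
These observations about the Q-matrix can be checked mechanically. The sketch below copies the rows of Table 13 and assumes, as in the proofs of the Appendix, that a Q-matrix is complete when every unit vector $e_k$ appears among its rows; the helper names `dedicated_attributes` and `items_using` are hypothetical.

```python
# Rows of the Q-matrix in Table 13, Items 1 through 20 (one character per attribute).
Q_ROWS = [
    "00010110", "00010010", "00010010", "01101010", "01010011",
    "00000010", "11000010", "00000010", "01000000", "01001011",
    "01001010", "00000011", "01011010", "01000010", "10000010",
    "01000010", "01001010", "01001110", "11101010", "01101010",
]


def dedicated_attributes(q_rows):
    """Return the 1-based attributes k for which some item requires only k."""
    found = set()
    for row in q_rows:
        if row.count("1") == 1:
            found.add(row.index("1") + 1)
    return sorted(found)


def items_using(q_rows, k):
    """Return the 1-based item numbers whose q-vector includes attribute k."""
    return [j + 1 for j, row in enumerate(q_rows) if row[k - 1] == "1"]


dedicated = dedicated_attributes(Q_ROWS)
is_complete = len(dedicated) == len(Q_ROWS[0])
items_without_7 = [j for j in range(1, len(Q_ROWS) + 1)
                   if j not in items_using(Q_ROWS, 7)]

print("dedicated attributes:", dedicated)
print("complete:", is_complete)
print("items using Attribute 6:", items_using(Q_ROWS, 6))
print("items not requiring Attribute 7:", items_without_7)
```

Only Attributes 2 and 7 have dedicated items, so under this definition the Q-matrix is far from complete.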

Table 14: Marginal posterior probabilities of mastery given the zero response
vector, $p(\alpha_k = 1 \mid x = 0)$

                                      k
            1      2      3      4      5      6      7      8
DINA        0.50   0.08   0.50   0.52   0.53   0.41   0.00   0.59
HO-DINA     0.00   0.13   0.31   0.05   0.02   0.30   0.00   0.25
RHO-DINA    0.02   0.13   0.12   0.05   0.02   0.25   0.00   0.18
ind-DINA    0.74   0.86   0.96   0.86   0.75   0.98   0.00   0.94

With eight attributes in the Q-matrix, there are a total of 256 possible
attribute profiles. They can be divided into just 58 different equivalence
classes by the partitioning algorithm, 32 of them containing a single
identifiable element. The 26 multiple-profile equivalence classes are listed in
Table 15, which also displays their class sizes, maximum likelihood
probabilities, and marginal identifiability vectors. Within these
multiple-profile equivalence classes, Attributes 2 and 7 are always marginally
identifiable, while Attribute 3 is never so; this is natural considering our
previous observations about the Q-matrix. Profiles within the largest classes
contain many zeroes, since under the DINA model non-identifiability affects a
particular attribute only for respondents who do not possess other attributes
used in combination with that attribute. Also note that the ind-DINA shows signs
of model misspecification, since its estimates $\hat{\nu}_{[\alpha]}$ deviate
strongly from the estimates derived from the other models.

Table 15: Multiple-member equivalence classes, along with their class sizes,
maximum likelihood probabilities, and marginal identifiability vectors.

[$\alpha$]    Size   DINA   HO-DINA   RHO-DINA   ind-DINA   $\delta_{[\alpha]}$
[00000000]    64     0.15   0.12      0.12       0.02       01000010
[01000000]    64     0.04   0.06      0.06       0.31       01000010
[00000010]    8      0.01   0.02      0.03       0.00       11010011
[10000010]    8      0.00   0.00      0.00       0.00       11010011
[00000011]    8      0.02   0.03      0.02       0.00       11010011
[10000011]    8      0.00   0.00      0.00       0.00       11010011
[01000010]    4      0.03   0.04      0.04       0.00       11011011
[01000011]    4      0.11   0.09      0.08       0.00       11011011
[11000010]    4      0.00   0.00      0.00       0.00       11011011
[11000011]    4      0.01   0.01      0.02       0.01       11011011
[00010010]    4      0.00   0.00      0.00       0.00       11010111
[10010010]    4      0.00   0.00      0.00       0.00       11010111
[00010011]    4      0.00   0.00      0.00       0.00       11010111
[10010011]    4      0.00   0.00      0.00       0.00       11010111
[00010110]    4      0.02   0.00      0.00       0.00       11010111
[10010110]    4      0.00   0.00      0.00       0.00       11010111
[00010111]    4      0.00   0.01      0.01       0.01       11010111
[10010111]    4      0.00   0.00      0.00       0.03       11010111
[01010010]    2      0.00   0.00      0.00       0.00       11011111
[11010010]    2      0.00   0.00      0.00       0.00       11011111
[01010011]    2      0.01   0.01      0.00       0.00       11011111
[11010011]    2      0.00   0.00      0.00       0.00       11011111
[01010110]    2      0.00   0.01      0.01       0.00       11011111
[11010110]    2      0.00   0.00      0.00       0.01       11011111
[01010111]    2      0.05   0.06      0.06       0.03       11011111
[11010111]    2      0.06   0.06      0.06       0.09       11011111
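
The class counts quoted above can be reproduced with a minimal sketch that groups all 256 profiles by their DINA ideal response vector $\xi(Q, \alpha)$. This is an illustration rather than the paper's partitioning algorithm, and it assumes that an attribute is marginally identifiable within a class exactly when every member of the class has the same value for it, which matches the vectors reported in Table 15. The function names are hypothetical.

```python
from itertools import product

# Rows of the Q-matrix in Table 13, Items 1 through 20.
Q_ROWS = [
    "00010110", "00010010", "00010010", "01101010", "01010011",
    "00000010", "11000010", "00000010", "01000000", "01001011",
    "01001010", "00000011", "01011010", "01000010", "10000010",
    "01000010", "01001010", "01001110", "11101010", "01101010",
]
Q = [tuple(int(c) for c in row) for row in Q_ROWS]
K = len(Q[0])


def ideal_response(alpha, Q):
    """DINA ideal response: item j is 1 iff alpha has every attribute it requires."""
    return tuple(int(all(a >= q for a, q in zip(alpha, row))) for row in Q)


def delta(members):
    """1 at position k iff all members of the class agree on attribute k."""
    return tuple(int(len({a[k] for a in members}) == 1)
                 for k in range(len(members[0])))


# Profiles with the same ideal response vector have the same likelihood.
classes = {}
for alpha in product((0, 1), repeat=K):
    classes.setdefault(ideal_response(alpha, Q), []).append(alpha)

n_single = sum(1 for m in classes.values() if len(m) == 1)
zero_class = classes[ideal_response((0,) * K, Q)]

print("classes:", len(classes))
print("single-member classes:", n_single)
print("size of [00000000]:", len(zero_class))
print("delta of [00000000]:", "".join(map(str, delta(zero_class))))
```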

Table 16 shows the estimated marginal identifiability rates,
$\hat{\zeta}_k$. At the low end, $\hat{\zeta}_3 \approx 0.48$, bringing into
question the ability of this assessment to measure mastery of Attribute 3.
Attribute 6 does only slightly better, with $\hat{\zeta}_6 = 0.64$. Note that
Attribute 6 is only utilized in Items 1 and 18; in both cases it appears in
conjunction with at least two other attributes.

Table 16: Estimated marginal identifiability rates $\hat{\zeta}_k$.

                                      k
            1      2      3      4      5      6      7      8
DINA        0.81   1.00   0.48   0.81   0.75   0.64   1.00   0.81
HO-DINA     0.82   1.00   0.47   0.82   0.75   0.64   1.00   0.82
RHO-DINA    0.82   1.00   0.48   0.82   0.75   0.63   1.00   0.82
ind-DINA    0.66   1.00   0.47   0.66   0.62   0.64   1.00   0.66

Finally, it is useful to consider how the reduction of the parameter space for 
the DINA model, based on identifiability, affects model selection by AIC and 
BIC. In particular, the AIC no longer prefers the ind-DINA to the full DINA 
model once the reduced parameter space has been applied. The BIC, which will 
generally choose sparser models than the AIC, still reports lower values for the 
ind-DINA, but the comparison is much tighter. 

Table 17: AIC and BIC for the DINA, RHO-DINA, and ind-DINA.

             parameters   AIC      BIC
DINA         296          9397.0   10665.2
NIAD-DINA    98           9001.0   9420.9
HO-DINA      56           8959.7   9199.6
RHO-DINA     49           8961.9   9171.9
ind-DINA     48           9208.3   9413.9


10 Discussion 

In general, it is difficult to obtain a complete Q-matrix. Oftentimes, due to
the demands of practicality, assessments must involve items that require a
combination of skills. Using the tools discussed in this paper, it is possible
to determine the extent to which nonidentifiability affects classification and
estimation under the DINA model. Marginal identifiability rates, which can be
estimated consistently, provide an overall measure of the extent of
non-identifiability; meanwhile, NIAD-DINA classification takes marginal
identifiability into consideration in order to control classification errors
that are otherwise quite sensitive to the prior information. The results here
suggest that when designing items to test a particular attribute, if using a
combination of skills is unavoidable, it is best to combine that attribute with
basic attributes mastered by a large proportion of the population. After all, it
is only impossible to recover information about a particular attribute when the
respondent does not possess one or more of the other attributes tested by the
same item.

Many of the methods currently in use may resolve identifiability issues by
enforcing restrictions on the attribute profile space. Variants of the DINA such
as the ind-DINA, HO-DINA, and RHO-DINA accomplish this by specifying a structure
and a prior on the probabilities $p(\alpha) = \nu_\alpha$. Although this may
eliminate non-identifiability and create a unique global maximum for the
likelihood, model misspecification becomes a risk. Thus, careful comparison of
these variants to the NIAD-DINA becomes important.

A Proofs

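Proof of Proposition 1. Write $\xi^i = \xi(Q, \alpha^i)$ for $i = 1, 2$. Suppose
$\xi_j^1 = \xi_j^2$ for all $j$ such that $1 - s_j \neq g_j$. If
$1 - s_j = g_j$, then the response distribution for item $j$ does not depend on
$\xi_j$:

$$P(x_j \mid \xi_j, s_j, g_j) = (1 - s_j)^{x_j} s_j^{1 - x_j} = g_j^{x_j} (1 - g_j)^{1 - x_j}.$$

Thus, for every $x \in \{0, 1\}^J$,

$$P(x \mid \xi^1, s, g) = \prod_{j=1}^{J} P(x_j \mid \xi_j^1, s_j, g_j) = \prod_{j=1}^{J} P(x_j \mid \xi_j^2, s_j, g_j) = P(x \mid \xi^2, s, g)$$

and $\alpha^1$ cannot be separated from $\alpha^2$.

Now suppose that $\xi_j^1 \neq \xi_j^2$ for some $j$ such that
$1 - s_j \neq g_j$. Then

$$P(x_j = 1 \mid \xi^1, s, g) = (1 - s_j)^{\xi_j^1} g_j^{1 - \xi_j^1} \neq g_j^{\xi_j^1} (1 - s_j)^{1 - \xi_j^1} = g_j^{1 - \xi_j^2} (1 - s_j)^{\xi_j^2} = P(x_j = 1 \mid \xi^2, s, g),$$

so the response distributions differ. □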
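
A small numerical illustration of this argument, under assumptions that are not taken from the paper: a hypothetical two-item, two-attribute Q-matrix and arbitrary slip and guess values with $1 - s_j \neq g_j$. Profiles sharing an ideal response vector produce identical response distributions, while profiles with different ideal responses do not.

```python
from itertools import product


def ideal_response(alpha, Q):
    """DINA ideal response: item j is 1 iff alpha has every attribute it requires."""
    return tuple(int(all(a >= q for a, q in zip(alpha, row))) for row in Q)


def likelihood(x, alpha, Q, s, g):
    """P(x | alpha, s, g) under the DINA model with local independence."""
    prob = 1.0
    for xj, xi, sj, gj in zip(x, ideal_response(alpha, Q), s, g):
        p_correct = (1 - sj) if xi == 1 else gj   # P(x_j = 1 | xi_j)
        prob *= p_correct if xj == 1 else 1 - p_correct
    return prob


# Hypothetical example: Item 1 requires Attribute 1, Item 2 requires both attributes.
Q = [(1, 0), (1, 1)]
s = [0.1, 0.2]   # slip parameters (illustrative values)
g = [0.2, 0.3]   # guess parameters (illustrative values)


def response_distribution(alpha):
    """Probabilities of all response vectors x in {0,1}^J for one profile."""
    return [likelihood(x, alpha, Q, s, g) for x in product((0, 1), repeat=len(Q))]


# (0,0) and (0,1) share the ideal response (0,0); (1,0) and (1,1) do not.
same = response_distribution((0, 0)) == response_distribution((0, 1))
differ = response_distribution((1, 0)) != response_distribution((1, 1))
print(same, differ)
```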

Proof of the proposition on completeness of $Q$. Suppose that $Q$ is complete.
WLOG, for $j = 1, \dots, K$ let $q_j = e_j$. Then, for $j = 1, \dots, K$,
$\xi_j(Q, \alpha) = \alpha_j$ and given any two attribute profiles
$\alpha^1 \neq \alpha^2$,

$$\xi_{1:K}(Q, \alpha^1) = \alpha^1 \neq \alpha^2 = \xi_{1:K}(Q, \alpha^2).$$

By Proposition 1, $Q$ separates any $\alpha^1 \neq \alpha^2$.

Now suppose that there exists $k^* \in \{1, \dots, K\}$ such that
$e_{k^*} \notin R_Q$. WLOG, suppose $k^* = 1$. Consider profiles
$\alpha = e_1$ and $\alpha' = 0$, the zero column-vector. For each item
$j = 1, \dots, J$, if $q_{j1} = 0$ then

$$\xi_j(Q, e_1) = (1^0) \prod_{k \neq 1} 0^{q_{jk}} = (0^0) \prod_{k \neq 1} 0^{q_{jk}} = \xi_j(Q, 0).$$

Else, $q_{j1} = 1$ and there exists some $k' \neq 1$ such that $q_{jk'} = 1$ and

$$\xi_j(Q, e_1) = (1^1) \prod_{k \neq 1} 0^{q_{jk}} = 0 = (0^1) \prod_{k \neq 1} 0^{q_{jk}} = \xi_j(Q, 0).$$

Thus, $\xi(Q, e_1) = \xi(Q, 0)$ and by Proposition 1 attribute profiles $e_1$
and $0$ cannot be separated. □

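Proof of the proposition that $\sim$ is an equivalence relation. The relation is
(i) reflexive: $\xi(Q, \alpha) = \xi(Q, \alpha) \Rightarrow \alpha \sim \alpha$
for all profiles $\alpha$; (ii) symmetric: if $\alpha^1 \sim \alpha^2$, then
$\xi(Q, \alpha^2) = \xi(Q, \alpha^1)$ and $\alpha^2 \sim \alpha^1$ for any
profiles $\alpha^1, \alpha^2$; (iii) transitive: if $\alpha^1 \sim \alpha^2$ and
$\alpha^2 \sim \alpha^3$, then
$\xi(Q, \alpha^1) = \xi(Q, \alpha^2) = \xi(Q, \alpha^3)$ and
$\alpha^1 \sim \alpha^3$ for any profiles $\alpha^1, \alpha^2, \alpha^3$. □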

Proof of Theorem 1. We can write the likelihood as

$$L(\nu) = \prod_{i=1}^{N} \sum_{[\alpha]} p(x^i \mid [\alpha])\, \nu_{[\alpha]}$$

and the log-likelihood as

$$\ell(\nu) = \sum_{i=1}^{N} \log p(x^i \mid \nu) = \sum_{x \in \{0,1\}^J} N_x \log \Big( \sum_{[\alpha]} p(x \mid [\alpha])\, \nu_{[\alpha]} \Big),$$

where $N_x = \#\{i : x^i = x\}$.

We first check for identifiability. Suppose there are $L$ distinct equivalence
classes partitioning the attribute profile space. The probability of each vector
$x$ given each class $[\alpha]$ can be written as a $2^J \times L$ matrix
$P = (p_{x,[\alpha]})$, where $p_{x,[\alpha]} = p(x \mid [\alpha])$. The vector
of total probabilities for each response vector $x$, for any $\nu$, can be
written as the matrix product $P\nu$. Thus, we need to show that there are no
$\nu^1 \neq \nu^2$ s.t. $P\nu^1 = P\nu^2$. Define the vector inequality
operation "$\geq$" so that $x \geq y$ iff $x_j \geq y_j$ for all $j$. Let the
T-matrix be the $2^J \times L$ matrix $T = (t_{x,[\alpha]})$, indexed over all
response vectors $x$ and equivalence classes $[\alpha]$, such that
$t_{x,[\alpha]} = P(X \geq x \mid [\alpha])$. This is a variant of the T-matrix
used to examine Q-matrix identifiability in Liu, Xu, and Ying (2012). Then
$P\nu^1 = P\nu^2$ iff $T\nu^1 = T\nu^2$, and the identifiability condition is
equivalent to $T$ being a rank $L$ matrix.

First, suppose that $g = 0$. WLOG, assume that the $L$ equivalence classes
$[\alpha^1], \dots, [\alpha^L]$ are ordered lexicographically by their minimal
representatives, $\alpha^{\ell*}$. Thus, if $\alpha^{k*} \geq \alpha^{\ell*}$,
then $k \geq \ell$. Also, let $x^\ell = \xi(Q, [\alpha^\ell])$ for
$\ell = 1, \dots, L$. Define $T^* = (t^*_{k\ell})$, where
$t^*_{k\ell} = t_{x^k,[\alpha^\ell]}$. Then $T^*$ is an $L \times L$ submatrix
of $T$, containing the specified rows $x^1, \dots, x^L$. Moreover, $T^*$ is an
upper triangular matrix. This is a consequence of the fact that for any
$k > \ell$, $\alpha^{\ell*} \not\geq \alpha^{k*}$. Thus, there must be some item
$j \in \{1, \dots, J\}$ for which individuals with profiles
$\alpha \in [\alpha^k]$ possess the necessary attributes, but individuals with
profiles $\alpha \in [\alpha^\ell]$ do not. Then
$p(X_j = 1 \mid [\alpha^\ell]) = g_j = 0 \Rightarrow t^*_{k\ell} = t_{x^k,[\alpha^\ell]} = 0$.
In addition, on the diagonal,
$t^*_{\ell\ell} = \prod_{\{j : x^\ell_j = 1\}} (1 - s_j) \neq 0$. Thus, $T^*$ is
a rank $L$ matrix, as is $T$.

Next suppose that $g \neq 0$. Consider the T-matrix as a function of
$c = 1 - s$ and $g$. Then the T-matrix $T(c, g)$ can be written as a linear
transformation of another T-matrix $T(c - g, 0)$. For any subset of the items
$S$ and any constants $b_j$,

$$\prod_{j \in S} (b_j - g_j) = \prod_{j \in S} b_j - \sum_{j \in S} g_j \prod_{k \in S,\, k \neq j} b_k + \sum_{j, k \in S,\, j < k} g_j g_k \prod_{\ell \in S,\, \ell \neq j, k} b_\ell - \cdots + (-1)^{|S|} \prod_{j \in S} g_j.$$

For each entry of the T-matrix, the $b_j$ will correspond to either $c_j$ or
$g_j$, depending on the value of $\xi_j(Q, [\alpha])$. Then,
$T(c - g, 0) = D(g)\, T(c, g)$, where the transformation matrix $D(g)$ is a
$2^J \times 2^J$ matrix depending solely on $g$. Since the rows of $T$ are
ordered lexicographically by $x$, $D(g)$ is lower triangular with diagonal
$\mathrm{diag}(D) = 1$. Thus, $D$ is full-rank and
$\mathrm{rank}(T(c, g)) = \mathrm{rank}(T(c - g, 0)) = L$. The model is
identifiable, and all other conditions for the consistency of the maximum
likelihood estimator are clearly evident. □
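
The rank condition can be checked numerically on a toy case. The sketch below is an illustration under stated assumptions, not part of the proof: it uses a hypothetical two-item, two-attribute Q-matrix, takes $t_{x,[\alpha]}$ to be the product of $P(X_j = 1 \mid [\alpha])$ over the items with $x_j = 1$, and computes the rank exactly with rational arithmetic. The helper names `t_matrix` and `rank` are hypothetical.

```python
from fractions import Fraction
from itertools import product


def ideal_response(alpha, Q):
    """DINA ideal response: item j is 1 iff alpha has every attribute it requires."""
    return tuple(int(all(a >= q for a, q in zip(alpha, row))) for row in Q)


def t_matrix(Q, c, g):
    """Build T with rows indexed by x in {0,1}^J and columns by equivalence class.

    Entry = product over {j : x_j = 1} of P(X_j = 1 | class), which equals
    c_j = 1 - s_j when the class's ideal response is 1 and g_j otherwise.
    """
    n_attr, n_items = len(Q[0]), len(Q)
    xis = sorted({ideal_response(a, Q) for a in product((0, 1), repeat=n_attr)})
    rows = []
    for x in product((0, 1), repeat=n_items):
        row = []
        for xi in xis:
            entry = Fraction(1)
            for j in range(n_items):
                if x[j] == 1:
                    entry *= c[j] if xi[j] == 1 else g[j]
            row.append(entry)
        rows.append(row)
    return rows, len(xis)


def rank(matrix):
    """Rank by exact Gaussian elimination over the rationals."""
    m = [row[:] for row in matrix]
    r = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


Q = [(1, 0), (1, 1)]                       # hypothetical Q-matrix
c = [Fraction(9, 10), Fraction(8, 10)]     # c = 1 - s (illustrative values)
g = [Fraction(2, 10), Fraction(3, 10)]     # guess parameters (illustrative values)
zero = [Fraction(0), Fraction(0)]

T_g0, L = t_matrix(Q, c, zero)
T_g, _ = t_matrix(Q, c, g)
print(L, rank(T_g0), rank(T_g))
```

In this toy case both matrices have rank equal to the number of equivalence classes, as the proof requires.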

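Proof of the proposition on consistency of $\hat{\zeta}_k$. Since

$$\sum_{\{\alpha : \delta_{[\alpha],k} = 1\}} \nu_\alpha = \sum_{\{[\alpha] : \delta_{[\alpha],k} = 1\}} \sum_{\alpha' \in [\alpha]} \nu_{\alpha'} = \sum_{\{[\alpha] : \delta_{[\alpha],k} = 1\}} \nu_{[\alpha]},$$

$\zeta_k$ can be written in terms of $\nu_{[\alpha]}$ as

$$\zeta_k = \sum_{\{[\alpha] : \delta_{[\alpha],k} = 1\}} \nu_{[\alpha]}.$$

By Theorem 1 the MLE $\hat{\nu}_{[\alpha]}$ is consistent as $N \to \infty$
under the conditions of the proposition. Thus, $\hat{\zeta}_k$ is consistent as
$N \to \infty$. □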